summarization method
Chain of Summaries: Summarization Through Iterative Questioning
Brach, William, Poech, Lukas Galke
Large Language Models (LLMs) are increasingly using external web content. However, much of this content is not easily digestible by LLMs due to LLM-unfriendly formats and limitations of context length. To address this issue, we propose a method for generating general-purpose, information-dense summaries that act as plain-text repositories of web content. Inspired by Hegel's dialectical method, our approach, denoted as Chain of Summaries (CoS), iteratively refines an initial summary (thesis) by identifying its limitations through questioning (antithesis), leading to a general-purpose summary (synthesis) that can satisfy current and anticipate future information needs. Experiments on the TriviaQA, TruthfulQA, and SQUAD datasets demonstrate that CoS outperforms zero-shot LLM baselines by up to 66% and specialized summarization methods such as BRIO and PEGASUS by up to 27%. CoS-generated summaries yield higher Q&A performance compared to the source content, while requiring substantially fewer tokens and being agnostic to the specific downstream LLM. CoS thus resembles an appealing option for website maintainers to make their content more accessible for LLMs, while retaining possibilities for human oversight.
SD-VSum: A Method and Dataset for Script-Driven Video Summarization
Mylonas, Manolis, Apostolidis, Evlampios, Mezaris, Vasileios
In this work, we introduce the task of script-driven video summarization, which aims to produce a summary of the full-length video by selecting the parts that are most relevant to a user-provided script outlining the visual content of the desired summary. Following, we extend a recently-introduced large-scale dataset for generic video summarization (VideoXum) by producing natural language descriptions of the different human-annotated summaries that are available per video. In this way we make it compatible with the introduced task, since the available triplets of ``video, summary and summary description'' can be used for training a method that is able to produce different summaries for a given video, driven by the provided script about the content of each summary. Finally, we develop a new network architecture for script-driven video summarization (SD-VSum), that employs a cross-modal attention mechanism for aligning and fusing information from the visual and text modalities. Our experimental evaluations demonstrate the advanced performance of SD-VSum against SOTA approaches for query-driven and generic (unimodal and multimodal) summarization from the literature, and document its capacity to produce video summaries that are adapted to each user's needs about their content.
Fair Summarization: Bridging Quality and Diversity in Extractive Summaries
Nezhad, Sina Bagheri, Bandyapadhyay, Sayan, Agrawal, Ameeta
Fairness in multi-document summarization of user-generated content remains a critical challenge in natural language processing (NLP). Existing summarization methods often fail to ensure equitable representation across different social groups, leading to biased outputs. In this paper, we introduce two novel methods for fair extractive summarization: FairExtract, a clustering-based approach, and FairGPT, which leverages GPT-3.5-turbo with fairness constraints. We evaluate these methods using Divsumm summarization dataset of White-aligned, Hispanic, and African-American dialect tweets and compare them against relevant baselines. The results obtained using a comprehensive set of summarization quality metrics such as SUPERT, BLANC, SummaQA, BARTScore, and UniEval, as well as a fairness metric F, demonstrate that FairExtract and FairGPT achieve superior fairness while maintaining competitive summarization quality. Additionally, we introduce composite metrics (e.g., SUPERT+F, BLANC+F) that integrate quality and fairness into a single evaluation framework, offering a more nuanced understanding of the trade-offs between these objectives. This work highlights the importance of fairness in summarization and sets a benchmark for future research in fairness-aware NLP models.
Summarization of Investment Reports Using Pre-trained Model
Sakaji, Hiroki, Kobayashi, Ryotaro, Izumi, Kiyoshi, Mitsugi, Hiroyuki, Kuramoto, Wataru
In this paper, we attempt to summarize monthly reports as investment reports. Fund managers have a wide range of tasks, one of which is the preparation of investment reports. In addition to preparing monthly reports on fund management, fund managers prepare management reports that summarize these monthly reports every six months or once a year. The preparation of fund reports is a labor-intensive and time-consuming task. Therefore, in this paper, we tackle investment summarization from monthly reports using transformer-based models. There are two main types of summarization methods: extractive summarization and abstractive summarization, and this study constructs both methods and examines which is more useful in summarizing investment reports.
GLIMPSE: Pragmatically Informative Multi-Document Summarization for Scholarly Reviews
Darrin, Maxime, Arous, Ines, Piantanida, Pablo, Cheung, Jackie CK
Scientific peer review is essential for the quality of academic publications. However, the increasing number of paper submissions to conferences has strained the reviewing process. This surge poses a burden on area chairs who have to carefully read an ever-growing volume of reviews and discern each reviewer's main arguments as part of their decision process. In this paper, we introduce \sys, a summarization method designed to offer a concise yet comprehensive overview of scholarly reviews. Unlike traditional consensus-based methods, \sys extracts both common and unique opinions from the reviews. We introduce novel uniqueness scores based on the Rational Speech Act framework to identify relevant sentences in the reviews. Our method aims to provide a pragmatic glimpse into all reviews, offering a balanced perspective on their opinions. Our experimental results with both automatic metrics and human evaluation show that \sys generates more discriminative summaries than baseline methods in terms of human evaluation while achieving comparable performance with these methods in terms of automatic metrics.
Towards Efficient Patient Recruitment for Clinical Trials: Application of a Prompt-Based Learning Model
Rahmanian, Mojdeh, Fakhrahmad, Seyed Mostafa, Mousavi, Seyedeh Zahra
Objective: Clinical trials are essential for advancing pharmaceutical interventions, but they face a bottleneck in selecting eligible participants. Although leveraging electronic health records (EHR) for recruitment has gained popularity, the complex nature of unstructured medical texts presents challenges in efficiently identifying participants. Natural Language Processing (NLP) techniques have emerged as a solution with a recent focus on transformer models. In this study, we aimed to evaluate the performance of a prompt-based large language model for the cohort selection task from unstructured medical notes collected in the EHR. Methods: To process the medical records, we selected the most related sentences of the records to the eligibility criteria needed for the trial. The SNOMED CT concepts related to each eligibility criterion were collected. Medical records were also annotated with MedCAT based on the SNOMED CT ontology. Annotated sentences including concepts matched with the criteria-relevant terms were extracted. A prompt-based large language model (Generative Pre-trained Transformer (GPT) in this study) was then used with the extracted sentences as the training set. To assess its effectiveness, we evaluated the model's performance using the dataset from the 2018 n2c2 challenge, which aimed to classify medical records of 311 patients based on 13 eligibility criteria through NLP techniques. Results: Our proposed model showed the overall micro and macro F measures of 0.9061 and 0.8060 which were among the highest scores achieved by the experiments performed with this dataset. Conclusion: The application of a prompt-based large language model in this study to classify patients based on eligibility criteria received promising scores. Besides, we proposed a method of extractive summarization with the aid of SNOMED CT ontology that can be also applied to other medical texts.
FastPerson: Enhancing Video Learning through Effective Video Summarization that Preserves Linguistic and Visual Contexts
Kawamura, Kazuki, Rekimoto, Jun
Quickly understanding lengthy lecture videos is essential for learners with limited time and interest in various topics to improve their learning efficiency. To this end, video summarization has been actively researched to enable users to view only important scenes from a video. However, these studies focus on either the visual or audio information of a video and extract important segments in the video. Therefore, there is a risk of missing important information when both the teacher's speech and visual information on the blackboard or slides are important, such as in a lecture video. To tackle this issue, we propose FastPerson, a video summarization approach that considers both the visual and auditory information in lecture videos. FastPerson creates summary videos by utilizing audio transcriptions along with on-screen images and text, minimizing the risk of overlooking crucial information for learners. Further, it provides a feature that allows learners to switch between the summary and original videos for each chapter of the video, enabling them to adjust the pace of learning based on their interests and level of understanding. We conducted an evaluation with 40 participants to assess the effectiveness of our method and confirmed that it reduced viewing time by 53\% at the same level of comprehension as that when using traditional video playback methods.
Low-Resource Court Judgment Summarization for Common Law Systems
Liu, Shuaiqi, Cao, Jiannong, Li, Yicong, Yang, Ruosong, Wen, Zhiyuan
Common law courts need to refer to similar precedents' judgments to inform their current decisions. Generating high-quality summaries of court judgment documents can facilitate legal practitioners to efficiently review previous cases and assist the general public in accessing how the courts operate and how the law is applied. Previous court judgment summarization research focuses on civil law or a particular jurisdiction's judgments. However, judges can refer to the judgments from all common law jurisdictions. Current summarization datasets are insufficient to satisfy the demands of summarizing precedents across multiple jurisdictions, especially when labeled data are scarce for many jurisdictions. To address the lack of datasets, we present CLSum, the first dataset for summarizing multi-jurisdictional common law court judgment documents. Besides, this is the first court judgment summarization work adopting large language models (LLMs) in data augmentation, summary generation, and evaluation. Specifically, we design an LLM-based data augmentation method incorporating legal knowledge. We also propose a legal knowledge enhanced evaluation metric based on LLM to assess the quality of generated judgment summaries. Our experimental results verify that the LLM-based summarization methods can perform well in the few-shot and zero-shot settings. Our LLM-based data augmentation method can mitigate the impact of low data resources. Furthermore, we carry out comprehensive comparative experiments to find essential model components and settings that are capable of enhancing summarization performance.
RankSum An unsupervised extractive text summarization based on rank fusion
Joshi, A., Fidalgo, E., Alegre, E., Alaiz-Rodriguez, R.
In this paper, we propose Ranksum, an approach for extractive text summarization of single documents based on the rank fusion of four multi-dimensional sentence features extracted for each sentence: topic information, semantic content, significant keywords, and position. The Ranksum obtains the sentence saliency rankings corresponding to each feature in an unsupervised way followed by the weighted fusion of the four scores to rank the sentences according to their significance. The scores are generated in completely unsupervised way, and a labeled document set is required to learn the fusion weights. Since we found that the fusion weights can generalize to other datasets, we consider the Ranksum as an unsupervised approach. To determine topic rank, we employ probabilistic topic models whereas semantic information is captured using sentence embeddings. To derive rankings using sentence embeddings, we utilize Siamese networks to produce abstractive sentence representation and then we formulate a novel strategy to arrange them in their order of importance. A graph-based strategy is applied to find the significant keywords and related sentence rankings in the document. We also formulate a sentence novelty measure based on bigrams, trigrams, and sentence embeddings to eliminate redundant sentences from the summary. The ranks of all the sentences computed for each feature are finally fused to get the final score for each sentence in the document. We evaluate our approach on publicly available summarization datasets CNN/DailyMail and DUC 2002. Experimental results show that our approach outperforms other existing state-of-the-art summarization methods.
Beyond the Frame: Single and mutilple video summarization method with user-defined length
Kalkhorani, Vahid Ahmadi, Zhang, Qingquan, Song, Guanqun, Zhu, Ting
Video smmarization is a crucial method to reduce the time of videos which reduces the spent time to watch/review a long video. This apporach has became more important as the amount of publisehed video is increasing everyday. A single or multiple videos can be summarized into a relatively short video using various of techniques from multimodal audio-visual techniques, to natural language processing approaches. Audiovisual techniques may be used to recognize significant visual events and pick the most important parts, while NLP techniques can be used to evaluate the audio transcript and extract the main sentences (timestamps) and corresponding video frames from the original video. Another approach is to use the best of both domain. Meaning that we can use audio-visual cues as well as video transcript to extract and summarize the video. In this paper, we combine a variety of NLP techniques (extractive and contect-based summarizers) with video processing techniques to convert a long video into a single relatively short video. We design this toll in a way that user can specify the relative length of the summarized video. We have also explored ways of summarizing and concatenating multiple videos into a single short video which will help having most important concepts from the same subject in a single short video. Out approach shows that video summarizing is a difficult but significant work, with substantial potential for further research and development, and it is possible thanks to the development of NLP models.